27 research outputs found

Open-world Person Re-Identification by Multi-Label Assignment Inference

Author: Cancela B
Gong S
Hospedales TM
Publication venue: BMVA Press
Publication date: 01/01/2014

Crossref

Queen Mary Research Online

Identifying Rare and Subtle Behaviors: A Weakly Supervised Joint Topic Model

Author: Gong SG
Hospedales TM
Li J
Xiang T
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2011

Crossref

Queen Mary Research Online

Transductive Multi-View Zero-Shot Learning

Author: Fu Y
Gong S
Hospedales TM
Xiang T
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 02/03/2015

arXiv.org e-Print Archive

CiteSeerX

Queen Mary Research Online

Fine-grained sketch-based image retrieval by matching deformable part models

Author: Gong S
Hospedales TM
Li Y
Song YZ
Publication date: 01/01/2014

© 2014. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.

An important characteristic of sketches, compared with text, rests with their ability to intrinsically capture object appearance and structure. Nonetheless, akin to traditional text-based image retrieval, conventional sketch-based image retrieval (SBIR) principally focuses on retrieving images of the same category, neglecting the fine-grained characteristics of sketches. In this paper, we advocate the expressiveness of sketches and examine their efficacy under a novel fine-grained SBIR framework. In particular, we study how sketches enable fine-grained retrieval within object categories. Key to this problem is introducing a mid-level sketch representation that not only captures object pose, but also possesses the ability to traverse sketch and image domains. Specifically, we learn a deformable part-based model (DPM) as a mid-level representation to discover and encode the various poses in sketch and image domains independently, after which graph matching is performed on DPMs to establish pose correspondences across the two domains. We further propose an SBIR dataset that covers the unique aspects of fine-grained SBIR. Through in-depth experiments, we demonstrate the superior performance of our SBIR framework, and showcase its unique ability in fine-grained retrieval.

CiteSeerX

Queen Mary Research Online

Weakly Supervised Learning of Objects, Attributes and Their Associations

Author: Hospedales TM
Shi Z
Xiang T
Yang Y
Publication date: 01/01/2014

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-10605-2_31

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of Surrey

Queen Mary Research Online

Surrey Research Insight

Learning Multimodal Latent Attributes

Author: Fu Y
Gong S
Hospedales TM
Xiang T
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/02/2014

The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity via transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular, we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce a concept of semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces requirements for an exhaustive accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches for addressing a variety of realistic multimedia sparse data learning tasks including multi-task learning, learning with label noise, N-shot transfer learning and, importantly, zero-shot learning.

CiteSeerX

Queen Mary Research Online

Multivariate Regression on the Grassmannian for Predicting Novel Domains

Author: Hospedales TM
Yang Y
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 17/04/2016

This work was supported by EPSRC (EP/L023385/1) and the European Union’s Horizon 2020 research and innovation program under grant agreement No. 640891.

Crossref

Edinburgh Research Explorer

Queen Mary Research Online

Finding Rare Classes: Active Learning with Generative and Discriminative Models

Author: Gong S
Hospedales TM
Xiang T
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 03/11/2016

Crossref

Queen Mary Research Online

Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images

Author: Hospedales TM
Shi Z
Xiang T
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/10/2015

We address the problem of localisation of objects as bounding boxes in images and videos with weak labels. This weakly supervised object localisation problem has been tackled in the past using discriminative models where each object class is localised independently from other classes. In this paper, a novel framework based on Bayesian joint topic modelling is proposed, which differs significantly from the existing ones in that: (1) All foreground object classes are modelled jointly in a single generative model that encodes multiple object co-existence so that “explaining away” inference can resolve ambiguity and lead to better learning and localisation. (2) Image backgrounds are shared across classes to better learn varying surroundings and “push out” objects of interest. (3) Our model can be learned with a mixture of weakly labelled and unlabelled data, allowing the large volume of unlabelled images on the Internet to be exploited for learning. Moreover, the Bayesian formulation enables the exploitation of various types of prior knowledge to compensate for the limited supervision offered by weakly labelled data, as well as Bayesian domain adaptation for transfer learning. Extensive experiments on the PASCAL VOC, ImageNet and YouTube-Object videos datasets demonstrate the effectiveness of our Bayesian joint model for weakly supervised object localisation.

CiteSeerX

Queen Mary Research Online

Discovery of Shared Semantic Spaces for Multiscene Video Query and Summarization

Author: Gong S
Hospedales TM
Xu X
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 27/07/2015

The growing rate of public space CCTV installations has generated a need for automated methods for exploiting video surveillance data including scene understanding, query, behaviour annotation and summarization. For this reason, extensive research has been performed on surveillance scene understanding and analysis. However, most studies have considered single scenes, or groups of adjacent scenes. The semantic similarity between different but related scenes (e.g., many different traffic scenes of similar layout) is not generally exploited to improve any automated surveillance tasks and reduce manual effort. Exploiting commonality, and sharing any supervised annotations, between different scenes is, however, challenging: some scenes are totally unrelated, so any information sharing between them would be detrimental, while others may share only a subset of common activities, so information sharing is useful only if it is selective. Moreover, semantically similar activities which should be modelled together and shared across scenes may have quite different pixel-level appearance in each scene. To address these issues we develop a new framework for distributed multiple-scene global understanding that clusters surveillance scenes by their ability to explain each other's behaviours, and further discovers which subset of activities are shared versus scene-specific within each cluster. We show how to use this structured representation of multiple scenes to improve common surveillance tasks including scene activity understanding, cross-scene query-by-example, behaviour classification with reduced supervised labelling requirements, and video summarization. In each case we demonstrate how our multi-scene model improves on a collection of standard single scene models and a flat model of all scenes.

Comment: Multi-Scene Traffic Behaviour Analysis. Accepted at IEEE Transactions on Circuits and Systems for Video Technology.

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Queen Mary Research Online